[Notes] Simple Recommendation System

Continue to implement get history

Implement getFavoriteItemsmethod

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
/**
* Get the favorite items given userId in the history table
* userId -> itemId -> favorite items
*/
@Override
public Set<Item> getFavoriteItems(String userId) {
if (conn == null) return new HashSet<Item>();
Set<Item> favoriteItems = new HashSet<Item>();
Set<String> itemIds = getFavoriteItemIds(userId);

try {
String sql = "SELECT * FROM items WHERE item_id = ?";
PreparedStatement stmt = conn.prepareStatement(sql);
for (String itemId: itemIds) {
stmt.setString(1, itemId);
// executeQuery returns a table(ResultSet)
ResultSet res = stmt.executeQuery();
ItemBuilder builder = new ItemBuilder();
// iterator
while (res.next()) {
builder.setItemId(res.getString("item_id"));
builder.setName(res.getString("name"));
builder.setAddress(res.getString("address"));
builder.setImageUrl(res.getString("image_url"));
builder.setUrl(res.getString("url"));
builder.setCategories(getCategories(itemId));
builder.setDistance(res.getDouble("distance"));
builder.setRating(res.getDouble("rating"));


favoriteItems.add(builder.build());
}

}
} catch (SQLException e) {
e.printStackTrace();
}
return favoriteItems;
}

Next, let’s try getFavoriteItemIds and getCategories

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
/**
* Given userId, query history table, get corresponding item_id.
*/
@Override
public Set<String> getFavoriteItemIds(String userId) {
if (conn == null) return new HashSet<String>();
Set<String> favoriteItemIds = new HashSet<String>();
try {
String sql = "SELECT * FROM history WHERE user_id = ?";
PreparedStatement stmt = conn.prepareStatement(sql);
stmt.setString(1, userId);
ResultSet res = stmt.executeQuery();
while (res.next()) {
favoriteItemIds.add(res.getString("item_id"));
}
} catch (SQLException e) {
e.printStackTrace();
}

return favoriteItemIds;
}

/**
* Given itemId, query categories table, get corresponding categories
*/
@Override
public Set<String> getCategories(String itemId) {
if (conn == null) return new HashSet<>();
Set<String> categories = new HashSet<String>();
try {
String sql = "SELECT category FROM categories WHERE item_id = ?";
PreparedStatement stmt = conn.prepareStatement(sql);
stmt.setString(1, itemId);
ResultSet res = stmt.executeQuery();
while (res.next()) {
categories.add(res.getString("category"));
}

} catch (SQLException e) {
e.printStackTrace();
}
return categories;
}

Get back to ItemHistory.java, update doGet method.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
/**
* Get the user_id from request, query the favorite items, and handle the response
*/
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
String userId = request.getParameter("user_id");
JSONArray array = new JSONArray();

DBConnection conn = DBConnectionFactory.getConnection();
Set<Item> items = conn.getFavoriteItems(userId);
for (Item item: items) {
JSONObject obj = item.toJSONObject();
try {
obj.append("favorite", true);
} catch (JSONException e) {
e.printStackTrace();
}
array.put(obj);
}
RpcHelper.writeJSONArray(response, array);
}

test with http://localhost:8080/Jupiter/history?user_id=1111

Recommendation System

Widely exists in industry and it is a popular interview question.

Facebook: news feed, friends, ‘Do you know this friend?’的例子

LinkedIn: jobs, companies, acquaintances, ‘Are you interested in this job?’的例子

Amazon: 电饭锅的例子

Google Ads: 两颗红豆的例子

Engineering Design

  1. Given a user, fetch all the events (ids) this user has visited. (which table? which API?)

    1
    2
    //history: history_id, user_id, item_id, last_favor_time.
    Set<String> itemIds = getFavoriteItemIds(userId);
  2. given all these events, what are the categories? (which table? which API?)

    1
    2
    //categories: item_id, category.
    Set<String> categories = getCategories(itemId);
  3. given these categories, what are the events that belong to them (which table? which API?)

1
2
//external API
List<Item> items = searchItems(userId, lat, lon, category);
  1. filter events that this user has visited

12. Simple Recommendation System-2019-4-17-11-58-6.png

Code Implementation

Add a new package src/algorithm. Add GeoRecommendation.java.

Idea of this algorithm: First get the current user’s favourite items, then get all categories of this user’s favorites, sort them by descending order. Then get these categories, search items again, sort this items with distance(Ascending order), and add those items to the result. Category > Distance.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
public List<Item> recommendItems(String userId, double lat, double lon) {
List<Item> recommendedItems = new ArrayList<Item>();
DBConnection conn = DBConnectionFactory.getConnection();
// 1. Get all favorite itemIds
Set<String> favoriteItemIds = conn.getFavoriteItemIds(userId);

// 2. Get all categories by favorite items, sort by count
// Use a map to store category and its counts
Map<String, Integer> allCategories = new HashMap<>();
for (String itemId: favoriteItemIds) {
Set<String> categories = conn.getCategories(itemId);
for (String category: categories) {
allCategories.put(category, allCategories.getOrDefault(category, 0) + 1);
}
}

// Sort the category
// avoid integer overflow, Use Integer.compare(int, int)
List<Entry<String, Integer>> categoryList = new ArrayList<>(allCategories.entrySet());

// more occurrences first
Collections.sort(categoryList, new Comparator<Entry<String, Integer>>(
) {

@Override
public int compare(Entry<String, Integer> o1, Entry<String, Integer> o2) {
return Integer.compare(o2.getValue(), o1.getValue());
}
});

// 3. do search based on category, filter out favorite events, sort by distance

Set<Item> visitedItems = new HashSet<Item>();
for (Entry<String ,Integer> category: categoryList) {
List<Item> items = conn.searchItems(lat, lon, category.getKey());
List<Item> filteredItems = new ArrayList<Item>();
for (Item item: items) {
// if not visited again and not already in the favorite list, add to recommendation list
if (!favoriteItemIds.contains(item.getItemId()) && !visitedItems.contains(item)) {
filteredItems.add(item);
}
}

// sort the items in filtered items. less distance first
Collections.sort(filteredItems, new Comparator<Item>() {

@Override
public int compare(Item o1, Item o2) {
return Double.compare(o1.getDistance(), o2.getDistance());
}
});

visitedItems.addAll(items);
recommendedItems.addAll(filteredItems);
}

return recommendedItems;
}

Step 2, Go to src/rpc/RecommendItem.java, implement doGet()method.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
/**
* Get the lat and lon and current userId, return a list of recommended items
*/
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
String userId = request.getParameter("user_id");
double lat = Double.parseDouble(request.getParameter("lat"));
double lon = Double.parseDouble(request.getParameter("lon"));

GeoRecommendation recommendation = new GeoRecommendation();
List<Item> items = recommendation.recommendItems(userId, lat, lon);

JSONArray result = new JSONArray();
try {
for(Item item: items) {
result.put(item.toJSONObject());
}
} catch (Exception e) {
e.printStackTrace();
}
RpcHelper.writeJSONArray(response, result);
}

Step 3, De-duplicate items in recommendation results.

Since we’re using set to save results from our recommendation function, we need do the de-duplication based on item_id. So go to Item.java, add hashCode() and equals().

Eclipse provides an easy way to add this two methods. Careful, only select itemId when generating these methods.